Efficient Index-Based Audio Matching
Internal identifier: 000787 (Main/Exploration); previous: 000786; next: 000788
Authors: Frank Kurth [Germany]; Meinard Müller [Germany]
Source:
- IEEE transactions on audio, speech and language processing [1558-7916]; 2008.
French descriptors
- Pascal (Inist)
- Base de données audio, Document musical, Enregistrement son, Identification système, Immunité bruit, Traitement signal audio, Compression signal, Artefact, Similitude, Requête, Présentation de la ligne appelante, Base de données, Analyse sémantique, Algorithme, Tolérance faute, Indexation, Traitement signal acoustique, Présentation ligne appelante.
- Wicri:
- topic: Base de données.
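English descriptors
- KwdEn:
- Acoustic signal processing, Algorithm, Artefact, Audio databases, Audio signal processing, Calling line identification presentation, Database, Fault tolerance, Indexing, Musical score, Noise immunity, Query, Semantic analysis, Signal compression, Similarity, Sound recording, System identification.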
Abstract
Given a large audio database of music recordings, the goal of classical audio identification is to identify a particular audio recording by means of a short audio fragment. Even though recent identification algorithms show a significant degree of robustness towards noise, MP3 compression artifacts, and uniform temporal distortions, the notion of similarity is rather close to the identity. In this paper, we address a higher-level retrieval problem, which we refer to as audio matching: given a short query audio clip, the goal is to automatically retrieve all excerpts from all recordings within the database that musically correspond to the query. In our matching scenario, as opposed to classical audio identification, we allow semantically motivated variations as they typically occur in different interpretations of a piece of music. To this end, this paper presents an efficient and robust audio matching procedure that works even in the presence of significant variations, such as nonlinear temporal, dynamical, and spectral deviations, where existing algorithms for audio identification would fail. Furthermore, the combination of various deformation- and fault-tolerance mechanisms allows us to employ standard indexing techniques to obtain an efficient, index-based matching procedure, thus providing an important step towards semantically searching large-scale real-world music collections.
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000011
- to stream PascalFrancis, to step Curation: 000001
- to stream PascalFrancis, to step Checkpoint: 000009
- to stream Main, to step Merge: 000787
- to stream Main, to step Curation: 000787
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Efficient Index-Based Audio Matching</title>
<author><name sortKey="Kurth, Frank" sort="Kurth, Frank" uniqKey="Kurth F" first="Frank" last="Kurth">Frank Kurth</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Research Institute for Communication, Information Processing and Ergonomics (FKIE), Research Establishment for Applied Science (FGAN)</s1>
<s2>53343 Wachtberg</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>53343 Wachtberg</wicri:noRegion>
<wicri:noRegion>Research Establishment for Applied Science (FGAN)</wicri:noRegion>
<wicri:noRegion>53343 Wachtberg</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Max-Planck Institut für Informatik, Department D4-Computer Graphics</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">08-0185407</idno>
<date when="2008">2008</date>
<idno type="stanalyst">PASCAL 08-0185407 INIST</idno>
<idno type="RBID">Pascal:08-0185407</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000011</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000001</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000009</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000009</idno>
<idno type="wicri:doubleKey">1558-7916:2008:Kurth F:efficient:index:based</idno>
<idno type="wicri:Area/Main/Merge">000787</idno>
<idno type="wicri:Area/Main/Curation">000787</idno>
<idno type="wicri:Area/Main/Exploration">000787</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Efficient Index-Based Audio Matching</title>
<author><name sortKey="Kurth, Frank" sort="Kurth, Frank" uniqKey="Kurth F" first="Frank" last="Kurth">Frank Kurth</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Research Institute for Communication, Information Processing and Ergonomics (FKIE), Research Establishment for Applied Science (FGAN)</s1>
<s2>53343 Wachtberg</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<wicri:noRegion>53343 Wachtberg</wicri:noRegion>
<wicri:noRegion>Research Establishment for Applied Science (FGAN)</wicri:noRegion>
<wicri:noRegion>53343 Wachtberg</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Max-Planck Institut für Informatik, Department D4-Computer Graphics</s1>
<s2>66123 Saarbrücken</s2>
<s3>DEU</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="2">Sarre (Land)</region>
<settlement type="city">Sarrebruck</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">IEEE transactions on audio, speech and language processing</title>
<idno type="ISSN">1558-7916</idno>
<imprint><date when="2008">2008</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">IEEE transactions on audio, speech and language processing</title>
<idno type="ISSN">1558-7916</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Acoustic signal processing</term>
<term>Algorithm</term>
<term>Artefact</term>
<term>Audio databases</term>
<term>Audio signal processing</term>
<term>Calling line identification presentation</term>
<term>Database</term>
<term>Fault tolerance</term>
<term>Indexing</term>
<term>Musical score</term>
<term>Noise immunity</term>
<term>Query</term>
<term>Semantic analysis</term>
<term>Signal compression</term>
<term>Similarity</term>
<term>Sound recording</term>
<term>System identification</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Base de données audio</term>
<term>Document musical</term>
<term>Enregistrement son</term>
<term>Identification système</term>
<term>Immunité bruit</term>
<term>Traitement signal audio</term>
<term>Compression signal</term>
<term>Artefact</term>
<term>Similitude</term>
<term>Requête</term>
<term>Présentation de la ligne appelante</term>
<term>Base de données</term>
<term>Analyse sémantique</term>
<term>Algorithme</term>
<term>Tolérance faute</term>
<term>Indexation</term>
<term>Traitement signal acoustique</term>
<term>Présentation ligne appelante</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Base de données</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Given a large audio database of music recordings, the goal of classical audio identification is to identify a particular audio recording by means of a short audio fragment. Even though recent identification algorithms show a significant degree of robustness towards noise, MP3 compression artifacts, and uniform temporal distortions, the notion of similarity is rather close to the identity. In this paper, we address a higher-level retrieval problem, which we refer to as audio matching: given a short query audio clip, the goal is to automatically retrieve all excerpts from all recordings within the database that musically correspond to the query. In our matching scenario, as opposed to classical audio identification, we allow semantically motivated variations as they typically occur in different interpretations of a piece of music. To this end, this paper presents an efficient and robust audio matching procedure that works even in the presence of significant variations, such as nonlinear temporal, dynamical, and spectral deviations, where existing algorithms for audio identification would fail. Furthermore, the combination of various deformation- and fault-tolerance mechanisms allows us to employ standard indexing techniques to obtain an efficient, index-based matching procedure, thus providing an important step towards semantically searching large-scale real-world music collections.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
<region><li>Sarre (Land)</li>
</region>
<settlement><li>Sarrebruck</li>
</settlement>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Kurth, Frank" sort="Kurth, Frank" uniqKey="Kurth F" first="Frank" last="Kurth">Frank Kurth</name>
</noRegion>
<name sortKey="Muller, Meinard" sort="Muller, Meinard" uniqKey="Muller M" first="Meinard" last="Müller">Meinard Müller</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000787 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000787 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sarre |area= MusicSarreV3 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:08-0185407 |texte= Efficient Index-Based Audio Matching }}
This area was generated with Dilib version V0.6.33.